Sharpened Error Bounds for Random Sampling Based $\ell_2$ Regression
Abstract
Given a data matrix X ∈ R^(n×d) and a response vector y ∈ R^n with n > d, it costs O(nd^2) time and O(nd) space to solve the least squares regression (LSR) problem exactly. When n and d are both large, exactly solving the LSR problem is very expensive. When n ≫ d, one feasible approach to accelerating LSR is to randomly embed y and all columns of X into the subspace R^c, where c ≪ n; the induced LSR problem has the same number of columns but far fewer rows, and it can be solved in O(cd^2) time and O(cd) space. Leverage score based sampling is an effective subspace embedding method and can be applied to accelerate LSR. It was previously shown that c = O(d ε^(-1) log d) is sufficient for achieving 1 + ε accuracy. In this paper we sharpen this error bound, showing that c = O(d log d + d ε^(-1)) is enough for 1 + ε accuracy.
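To make the scheme concrete, the following is a minimal Python sketch of leverage score sampling applied to LSR. The exact leverage scores are computed here from a thin SVD purely for illustration (in practice one would approximate them, since computing them exactly costs as much as solving LSR); the function name and interface are our own, not from the paper.

    import numpy as np

    def sampled_lsr(X, y, c, seed=None):
        # Approximately solve min_w ||X w - y||_2 via leverage score sampling.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        # Leverage score of row i = squared norm of the i-th row of U,
        # where X = U S V^T is a thin SVD; the scores sum to rank(X) <= d.
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        lev = np.sum(U ** 2, axis=1)
        p = lev / lev.sum()                           # sampling probabilities
        idx = rng.choice(n, size=c, replace=True, p=p)
        # Rescale each sampled row by 1/sqrt(c * p_i) so that the sketched
        # objective is an unbiased estimate of the full objective.
        scale = 1.0 / np.sqrt(c * p[idx])
        X_s = X[idx] * scale[:, None]
        y_s = y[idx] * scale
        # Solve the induced c x d problem: O(cd^2) time instead of O(nd^2).
        w, *_ = np.linalg.lstsq(X_s, y_s, rcond=None)
        return w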
Similar resources
Least-Squares Regression on Sparse Spaces
Another application is when one uses random projections to project each input vector into a lower dimensional space, and then train a predictor in the new compressed space (compression on the feature space). As is typical of dimensionality reduction techniques, this will reduce the variance of most predictors at the expense of introducing some bias. Random projections on the feature space, alon...
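As a sketch of this compression-on-the-feature-space idea, assuming a dense Gaussian projection (the snippet above is truncated, so the projection type and the downstream predictor are assumptions on our part):

    import numpy as np

    def project_features(X, k, seed=None):
        # Map each input vector in R^d to R^k with a Gaussian random
        # projection; scaling by 1/sqrt(k) roughly preserves inner products
        # in expectation (Johnson-Lindenstrauss style).
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        R = rng.standard_normal((d, k)) / np.sqrt(k)
        return X @ R

    # Any predictor can then be trained on the compressed features, e.g.
    # Z = project_features(X, k); w, *_ = np.linalg.lstsq(Z, y, rcond=None)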
Revisiting the Nystrom method for improved large-scale machine learning
We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our resul...
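As a companion illustration, here is a minimal sketch of the basic Nystrom approximation of an SPSD matrix, assuming uniform column sampling (the paper evaluates a range of sampling and projection variants; the function name is ours):

    import numpy as np

    def nystrom_approx(K, c, seed=None):
        # Nystrom approximation of an n x n SPSD matrix K from c sampled
        # columns: K is approximated by C W^+ C^T, where C holds the sampled
        # columns and W is the c x c block at their intersection.
        rng = np.random.default_rng(seed)
        n = K.shape[0]
        idx = rng.choice(n, size=c, replace=False)    # uniform sampling
        C = K[:, idx]
        W = K[np.ix_(idx, idx)]
        return C @ np.linalg.pinv(W) @ C.T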
Lecture 15: Additive-error Low-rank Matrix Approximation with Sampling and Projections
• A spectral norm bound for reconstruction error for the basic low-rank approximation random sampling algorithm.
• A discussion of how similar bounds can be obtained with a variety of random projection algorithms.
• A discussion of possible ways to improve the basic additive error bounds.
• An iterative algorithm that leads to additive error with much smaller additive scale. This will involve u...
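To illustrate the setting of the first bullet, here is a minimal sketch of additive-error low-rank approximation by column sampling with probabilities proportional to squared column norms; this generic scheme is an assumption on our part, since the lecture notes are only excerpted above.

    import numpy as np

    def sampled_low_rank(A, k, c, seed=None):
        # Sample c columns with probability proportional to squared column
        # norms, rescale them, and project A onto the span of the top-k left
        # singular vectors of the sample; the reconstruction error exceeds
        # that of the best rank-k approximation by an additive term.
        rng = np.random.default_rng(seed)
        p = np.sum(A ** 2, axis=0) / np.sum(A ** 2)
        idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
        C = A[:, idx] / np.sqrt(c * p[idx])
        U, _, _ = np.linalg.svd(C, full_matrices=False)
        U_k = U[:, :k]
        return U_k @ (U_k.T @ A)                      # rank-k approximation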
Online Active Linear Regression via Thresholding
We consider the problem of online active learning to collect data for regression modeling. Specifically, we consider a decision maker with a limited experimentation budget who must efficiently learn an underlying linear population model. Our main contribution is a novel threshold-based algorithm for selection of most informative observations; we characterize its performance and fundamental lowe...
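Since the snippet is truncated, the following Python sketch only illustrates the general shape of threshold-based selection under a query budget; the norm-based threshold rule and all names here are hypothetical, not the paper's actual algorithm.

    import numpy as np

    def active_linear_regression(stream, d, budget, tau):
        # Observe covariate vectors one at a time; spend one unit of budget
        # to query the label only when the point passes the threshold test.
        A = np.zeros((d, d))       # accumulates x x^T over queried points
        b = np.zeros(d)            # accumulates x * y over queried points
        for x, query_label in stream:                 # query_label: () -> y
            if budget == 0:
                break
            if np.linalg.norm(x) >= tau:              # threshold rule
                y = query_label()
                A += np.outer(x, x)
                b += x * y
                budget -= 1
        # Least squares estimate from the queried observations.
        return np.linalg.lstsq(A, b, rcond=None)[0]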
Bayesian Error Based Sequences of Mutual Information Bounds
The inverse relation between mutual information (MI) and Bayesian error is sharpened by deriving finite sequences of upper and lower bounds on MI in terms of the minimum probability of error (MPE) and related Bayesian quantities. The well known Fano upper bound and Feder-Merhav lower bound on equivocation are tightened by including a succession of posterior probabilities starting at the largest...
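For reference, the classical Fano upper bound mentioned above can be stated as follows (this is its standard form, not the paper's tightened sequence):

    H(X \mid Y) \;\le\; h_b(P_e) + P_e \log\bigl(\lvert \mathcal{X} \rvert - 1\bigr),
    \qquad h_b(p) = -p \log p - (1 - p) \log (1 - p),

where P_e is the minimum probability of error and H(X | Y) is the equivocation of X given the observation Y.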
Publication date: 2014